| 1 | In Fall of 2006 I did a small project on Metaobject Protocols for my |
| 2 | CS 331 class. Here lie my notes which may perhaps be useful to |
| 3 | others. I hope to expand them into something more useful over time. |
| 4 | |
| 5 | * Background |
| 6 | |
| 7 | ** Object Protocols |
| 8 | |
| 9 | An object protocol is a set of methods and specification of the |
| 10 | interactions between the methods which provide some generic behavior |
| 11 | (e.g. of a sequence) that are then implemented by classes which |
| 12 | conform to the protocol (e.g. a vector or list). In most object |
| 13 | systems a class contains both the methods which implement a protocol |
| 14 | and the data used by the implementation. The intent is to emulate |
| 15 | state machines which pass messages between each other. |
| 16 | |
| 17 | ** CLOS Way of OO |
| 18 | |
| 19 | The Common Lisp Object System (CLOS) is different. It separates |
| 20 | the data and method concepts into classes and generics. A class |
| 21 | contains data fields only, and a generic has methods specialized for |
| 22 | certain types attached to it. This seems a bit weird at first, but is |
| 23 | significantly more powerful as it encourages complete encapsulation |
| 24 | through its use of classes primarily for method specialization rather |
| 25 | than for state storage. |
| 26 | |
| 27 | *** Classes for Scratch Data and Types |
| 28 | |
| 29 | In CLOS classes store data in slots (which are the same as data |
| 30 | members). Encapsulation is not provided; any bit of code can use |
| 31 | =slot-value= to access or set the value of a slot. This may seem odd at |
| 32 | first, but encapsulation is of questionable importance as the slots |
| 33 | are meant only to be used by the protocol defined around the class. |
| 34 | |
| 35 | Classes are defined with =defclass= |
| 36 | |
| 37 | <src lang="lisp"> |
| 38 | (defclass name (superclasses ...) |
| 39 | ((slot-name :accessor slot-accessor ...) |
| 40 | ...) |
| 41 | (class-options ...)) |
| 42 | |
| 43 | (defclass example () |
| 44 | ((foo :accessor foo-of :initform 5))) |
| 45 | |
| 46 | (defclass example-child (example) |
| 47 | ((bar :accessor bar-of :initform (list 1 2 3)))) |
| 48 | </src> |
| 49 | |
| 50 | Slot definitions have several options; the above example shows only the |
| 51 | =:accessor= and =:initform= options which are the most commonly |
| 52 | used. =:accessor= generates an accessor for the slot (e.g. if you have |
| 53 | an instance of =example= you can =(setf (foo-of some-example-instance) |
| 54 | 'some-value)= to set and =(foo-of some-example-instance)= to access the |
| 55 | value). =:initform= provides a default initial value for the slot as a |
| 56 | symbolic expression to be evaluated when an instance is created in the |
| 57 | lexical environment of the class definition. |
| 58 | |
| 59 | *** Generics with Methods that Implement Protocols |
| 60 | |
| 61 | Generics are like normal functions in Lisp, but they only provide a |
| 62 | lambda list (parameter list). Methods are added to the generic which |
| 63 | specialize on the types of their parameters and provide an |
| 64 | implementation. This allows writing rich layered protocols which can |
| 65 | enable selective modification of individual facets with minimal code. |
| 66 | |
| 67 | <src lang="lisp"> |
| 68 | (defgeneric generic (parameters ...) |
| 69 | (options) ...) |
| 70 | |
| 71 | (defmethod generic-name ((parameter type) parameter ...) |
| 72 | "documentation string" |
| 73 | body) |
| 74 | |
| 75 | (defgeneric foo (bar baz quux) |
| 76 | (:documentation "Process the baz with the quux capacitor to make the |
| 77 | foo widget fly into the sky at warp speed")) |
| 78 | |
| 79 | (defmethod foo ((bar example) baz (quux capacitor)) |
| 80 | (launch bar (process-with quux baz))) |
| 81 | </src> |
| 82 | |
| 83 | A method lambda list differs from a normal lambda list only in that it |
| 84 | can specify the type of the parameter using the notation =(name type)=. |
| 85 | Note also that methods can specialize on the types of every |
| 86 | argument and not just the first one. This is quite powerful for |
| 87 | reasons outside of the scope of this presentation. |
| 88 | |
| 89 | * Limitations of Default Language Behavior |
| 90 | |
| 91 | The behavior of a language is a compromise between many competing |
| 92 | issues that attempts to be as generally useful as possible so that |
| 93 | *most* applications will have no issue with the default behavior. There |
| 94 | are, however, certain applications that could be cleanly written with |
| 95 | minor modifications to the behavior of the language, but would be |
| 96 | impossible or quite difficult to write otherwise. |
| 97 | |
| 98 | ** Slot Storage |
| 99 | |
| 100 | Most languages choose to preallocate storage for all of the slots of |
| 101 | an instance. Now imagine a contact database that stores information |
| 102 | about people in slots of a class. There may be dozens of slots, but |
| 103 | often many of them will be left blank. If slot storage is preallocated |
| 104 | much memory will be wasted and the database may not be able to fit |
| 105 | into the memory of the hardware it must run on (perhaps for financial |
| 106 | reasons, huge datasets, etc.). |
| 107 | |
| 108 | To save memory the author of the contact database must implement his |
| 109 | own system to store properties and allocate them lazily. This |
| 110 | represents a fair bit of effort, and would implement a system that |
| 111 | differed from the existing slot system of classes only regarding slot |
| 112 | storage. |
| 113 | |
| 114 | It would be useful if there were a way to customize slot allocation in |
| 115 | instances. The customizations would be minor and require overriding |
| 116 | only the initial allocation behavior and the behavior of the first |
| 117 | assignment to the slot. It is a a trivial problem in a language that |
| 118 | allows customization of these behaviors. |
| 119 | |
| 120 | ** Design Patterns |
| 121 | |
| 122 | Design Patterns are generalized versions of common patterns found in |
| 123 | programs. Many of them are merely methods to get around deficiencies |
| 124 | in the language, and can be quite messy to implement in some |
| 125 | languages. Ideally a pattern would be subsumed by the language, but |
| 126 | real world constraints require language standards to remain fairly |
| 127 | static. |
| 128 | |
| 129 | * Metasoftware |
| 130 | |
| 131 | Some types of programs could be written easily if the language were |
| 132 | customizable but are nearly impossible to write when it is not. |
| 133 | |
| 134 | ** Runtime Generated Classes |
| 135 | |
| 136 | Say you wanted to write a video game where players could create their |
| 137 | own objects, attach behaviors to the objects, and perhaps mix |
| 138 | different objects together to create new ones. When you abstract the |
| 139 | problem this looks just like an object system! Wouldn't it be nice if |
| 140 | your program could create new classes and methods on the fly portably? |
| 141 | |
| 142 | ** Object Inspection |
| 143 | |
| 144 | Imagine you were developing a complicated program with many different |
| 145 | objects that interacted in fairly complex ways. A tool to inspect the |
| 146 | structure of objects while debugging would be quite useful, but in a |
| 147 | traditional language would be impossible to implement portably. This |
| 148 | could force you to purchase a certain compiler implementation which |
| 149 | provided an inspector, and even then would likely not be customizable. |
| 150 | |
| 151 | This problem can be generalized to apply to most debugging tools; it |
| 152 | would be useful to write such tools portably because users of the |
| 153 | *language* and not the *compiler* need to debug software. Sharing |
| 154 | infrastructure would result in better tools (more developers), and |
| 155 | save the man-years of wasted effort that comes with having to rewrite |
| 156 | unportable tools from scratch multiple times. |
| 157 | |
| 158 | * Metaobject Protocols |
| 159 | |
| 160 | ** Limited/Generalized Internals of the Implementation |
| 161 | |
| 162 | A Metaobject Protocol (MOP) is a generalized and limited subset of the |
| 163 | underlying language implementation. It is limited to allow multiple |
| 164 | implementation strategies; this, along with careful design, is |
| 165 | essential because programming language research is ever advancing and |
| 166 | new techniques for creating more reliable and faster implementations |
| 167 | are still being discovered. |
| 168 | |
| 169 | This subset of the implementation is exported as a set of methods on |
| 170 | metaobjects. Thus the language is implemented in itself. The system |
| 171 | can then be customized using the extension and overriding features of |
| 172 | the language itself. |
| 173 | |
| 174 | ** Classes of MOPs |
| 175 | |
| 176 | *** Reflective |
| 177 | |
| 178 | A reflective MOP provides an interface to information *about* the |
| 179 | running system. It exposes class relationships, the methods attached |
| 180 | to a generic, etc. A reflective MOP often provides some functionality |
| 181 | for creating new classes at runtime. Smalltalk was one of the first |
| 182 | languages to expose a reflective MOP. |
| 183 | |
| 184 | **** Example: Object Inspector |
| 185 | |
| 186 | <src lang="lisp"> |
| 187 | (defgeneric example-inspect (instance) |
| 188 | (:documentation "Simple object inspector using CLOS MOP")) |
| 189 | |
| 190 | (defmethod example-inspect ((instance t)) |
| 191 | (format t "Simple Object~% Value: ~S~%" instance)) |
| 192 | |
| 193 | (defmethod example-inspect ((instance standard-object)) |
| 194 | (let ((class (class-of instance))) |
| 195 | (format t "Class: ~S, Superclasses: ~S~%" |
| 196 | (class-name class) |
| 197 | (mapcar #'class-name |
| 198 | (class-precedence-list class))) |
| 199 | (let ((slot-names (mapcar #'slot-definition-name |
| 200 | (class-slots class)))) |
| 201 | (format t "Slots: ~%~{ ~S~%~}" slot-names) |
| 202 | (inspect-loop slot-names instance #'example-inspect)))) |
| 203 | |
| 204 | (defun inspect-loop (slots instance inspector) |
| 205 | (format t "Enter slot to inspect or :pop to go up one level: ") |
| 206 | (finish-output) |
| 207 | (let* ((slot (read)) |
| 208 | (found-slot (member slot slots))) |
| 209 | (cond (found-slot |
| 210 | (funcall inspector (slot-value instance slot)) |
| 211 | (funcall inspector instance)) |
| 212 | ((eq slot :pop) t) |
| 213 | (t |
| 214 | (format t "~S is invalid. Valid slot names: ~S~%" |
| 215 | slot |
| 216 | slots) |
| 217 | (inspect-loop slots instance inspector))))) |
| 218 | </src> |
| 219 | |
| 220 | **** Example: Runtime Generated Classes and Methods |
| 221 | |
| 222 | *** Intercessory |
| 223 | |
| 224 | Intercessory MOPs allow the user to customize language behavior by |
| 225 | implementing methods which override certain aspects of the language |
| 226 | behavior. This class of MOPs are what make MOPs especially |
| 227 | powerful. No longer must a problem be restructured to fit the |
| 228 | implementation language; the underlying language can be reshaped to |
| 229 | fit the task at hand, and obfuscation of the intended structure of the |
| 230 | application can be avoided. |
| 231 | |
| 232 | **** Example: Lazily Allocated Slots |
| 233 | |
| 234 | **** Example: Observer Design Pattern |
| 235 | |
| 236 | A simple implementation of the observer pattern is under 100 lines, |
| 237 | and the user level code requires only a single line of code to make |
| 238 | any existing class observable. |
| 239 | |
| 240 | In a language lacking a MOP, implementing the observer pattern |
| 241 | requires modifying every accessor of a class to explicitly invoke any |
| 242 | observers, and necessitates the addition of a mixin class to the class |
| 243 | hierarchy. The fact that an object can be observed is a meta property |
| 244 | of the class, and forcing it to be implemented at the application |
| 245 | level dirties the inheritance hierarchy and adds unnecessary meta |
| 246 | details to the program. |
| 247 | |
| 248 | <src lang="lisp"> |
| 249 | ;;; This metaclass adds a slot to instances which use it, and so the |
| 250 | ;;; system is defined in its own package to avoid name conflicts |
| 251 | (defpackage :observer |
| 252 | (:use :cl :c2mop) |
| 253 | (:export observable register-observer unregister-observer)) |
| 254 | |
| 255 | (in-package :observer) |
| 256 | |
| 257 | ;;; Metaclass |
| 258 | (defclass observable (standard-class) |
| 259 | () |
| 260 | (:documentation "Metaclass for observable objects")) |
| 261 | |
| 262 | (defmethod compute-slots ((class observable)) |
| 263 | "Add a slot for storing observers to observable instances" |
| 264 | (cons (make-instance 'standard-effective-slot-definition |
| 265 | :name 'observers |
| 266 | :initform '(make-hash-table) |
| 267 | :initfunction #'(lambda () (make-hash-table))) |
| 268 | (call-next-method))) |
| 269 | |
| 270 | (defmethod validate-superclass ((class observable) |
| 271 | (super standard-class)) |
| 272 | t) |
| 273 | |
| 274 | (defun register-observer (instance slot-name key closure) |
| 275 | (register-observer-with-class (class-of instance) |
| 276 | instance |
| 277 | slot-name |
| 278 | key |
| 279 | closure)) |
| 280 | |
| 281 | (defun unregister-observer (instance slot-name key) |
| 282 | (unregister-observer-with-class (class-of instance) |
| 283 | instance |
| 284 | slot-name |
| 285 | key)) |
| 286 | |
| 287 | (defun get-observers (instance slot-name) |
| 288 | (get-observers-with-class (class-of instance) |
| 289 | instance |
| 290 | slot-name)) |
| 291 | |
| 292 | (defun add-observer-table (instance slot-name) |
| 293 | (setf (gethash slot-name (slot-value instance |
| 294 | 'observers)) |
| 295 | (make-hash-table))) |
| 296 | |
| 297 | (defgeneric register-observer-with-class (class instance slot-name key closure)) |
| 298 | (defgeneric unregister-observer-with-class (class |
| 299 | instance |
| 300 | slot-name |
| 301 | key)) |
| 302 | |
| 303 | (defmethod register-observer-with-class ((class observable) |
| 304 | instance |
| 305 | slot-name |
| 306 | key |
| 307 | closure) |
| 308 | (setf (gethash key |
| 309 | (or (gethash slot-name |
| 310 | (slot-value instance 'observers)) |
| 311 | ;; Lazily add observer hash tables |
| 312 | (add-observer-table instance slot-name))) |
| 313 | closure)) |
| 314 | |
| 315 | (defmethod unregister-observer-with-class ((class observable) |
| 316 | instance |
| 317 | slot-name |
| 318 | key) |
| 319 | (remhash key (gethash slot-name |
| 320 | (slot-value instance 'observers)))) |
| 321 | |
| 322 | (defmethod get-observers-with-class ((class observable) |
| 323 | instance |
| 324 | slot-name) |
| 325 | (gethash slot-name (slot-value instance 'observers))) |
| 326 | |
| 327 | (defmethod (setf slot-value-using-class) :before (new-value |
| 328 | (class observable) |
| 329 | instance |
| 330 | slot) |
| 331 | (let ((slot-name (slot-definition-name slot))) |
| 332 | (if (not (eq 'observers slot-name)) |
| 333 | (let ((observers |
| 334 | (get-observers instance (slot-definition-name slot)))) |
| 335 | (if observers |
| 336 | (maphash #'(lambda (key observer) |
| 337 | (funcall observer |
| 338 | (if (slot-boundp instance slot-name) |
| 339 | (slot-value instance slot-name) |
| 340 | nil) |
| 341 | new-value)) |
| 342 | observers)))))) |
| 343 | </src> |
| 344 | |
| 345 | |
| 346 | ** Violation of Encapsulation? |
| 347 | |
| 348 | A MOP may seem like a violation of encapsulation by revealing some |
| 349 | implementation details, but in reality a well designed protocol does |
| 350 | not reveal anything which was not already exposed. Implementation |
| 351 | decisions affect users, and some of these details do leak through to |
| 352 | higher levels (e.g. the memory layout of slots). Implicit in the |
| 353 | protocol specification are these implementation details, and the MOP |
| 354 | merely makes this limited subset available for customization. |
| 355 | |
| 356 | A MOP makes it possible to customize certain implementation decisions |
| 357 | that do not **radically** alter the behavior of the base language. The |
| 358 | conceptual vocabulary of the system retains its meaning, and so code |
| 359 | written in one dialect can interact with code written in another |
| 360 | without knowing that they speak different ones. |
| 361 | |
| 362 | * MOP Design Principles |
| 363 | |
| 364 | ** Layered Protocol |
| 365 | |
| 366 | A layered protocol design is good for both meta and normal object |
| 367 | protocols, and enables a combinatorial explosion of customizations to |
| 368 | the protocol. |
| 369 | |
| 370 | *** Top Level **Must** Call Lower Level Methods |
| 371 | |
| 372 | The top level methods of a layered protocol are required to call |
| 373 | certain lower level methods to perform some tasks. This both makes it |
| 374 | easier to customize the top level methods (which perform very broad |
| 375 | tasks) by providing some pieces of implementation for the programmer, |
| 376 | and enables more customization by opening up the replacement of lower |
| 377 | level functions as a way to alter a small detail of the high level |
| 378 | behavior. |
| 379 | |
| 380 | *** Lower Level Methods are Easier to Customize |
| 381 | |
| 382 | The lower level methods of a MOP are limited in scope and can be |
| 383 | implemented easily. Often the desired changes to language behavior are |
| 384 | minor, and having methods that perform simple tasks which are often |
| 385 | customized reduces the effort required to extend the system. |
| 386 | |
| 387 | ** Functional Where Possible |
| 388 | |
| 389 | Functional protocols are preferred for MOPs (and object protocols in |
| 390 | general). Functional protocols open up several optimizations for the |
| 391 | implementation without burdening the user of the protocol. |
| 392 | |
| 393 | *** Memoization |
| 394 | |
| 395 | Memoization is the process of saving the results of a function call |
| 396 | for future use. This avoids expensive recomputation of values which |
| 397 | have not changed (recall that a true function will always return the |
| 398 | same result when given the same arguments). |
| 399 | |
| 400 | A functional MOP can be optimized easily by exploiting this property |
| 401 | to memoize the return values of calls to expensive operations. A MOP |
| 402 | must be be very fast to avoid making programs unusably slow, and |
| 403 | memoization is able to give an appreciable speedup in many cases |
| 404 | without a significant burden on memory usage. |
| 405 | |
| 406 | *** Constant Shared Return Values |
| 407 | |
| 408 | Disallowing modification of values returned by protocol methods allows |
| 409 | the implementation to return large data structures by reference to |
| 410 | avoid expensive copying without having to do expensive data integrity |
| 411 | checks or copying. |
| 412 | |
| 413 | ** Procedural Only Where Necessary |
| 414 | |
| 415 | Some operations like method invocation are inherently stateful and so |
| 416 | must use a procedural protocol. There is no benefit to be gained from |
| 417 | using a functional protocol, and indeed an attempt would result in |
| 418 | obtuse code that severely restricted the implementation. Do note that |
| 419 | only a very small part of method invocation is stateful (the actual |
| 420 | call), and most of it can be implemented functionally (e.g. computing |
| 421 | the discriminating function). |
| 422 | |
| 423 | ** Real World |
| 424 | |
| 425 | *** [[http://common-lisp.net/project/ucw/][UCW]] and [[http://common-lisp.net/project/bese/arnesi.html][Arnesi]] |
| 426 | |
| 427 | Arnesi uses the CLOS MOP to implement methods which are transparently |
| 428 | rewritten into continuation passing style. This allows their execution |
| 429 | to be suspended at certain points and resumed later. UCW builds on top |
| 430 | of this to support a web framework where the statelessness of http is |
| 431 | hidden from the user; displaying a page suspends the execution of the |
| 432 | current continuation, and resumes it upon submission. The user level |
| 433 | code is completely unaware of this. |
| 434 | |
| 435 | *** [[http://clsql.b9.com][CLSQL]] |
| 436 | |
| 437 | CLSQL uses the reflective part of the CLOS MOP to map Common Lisp data |
| 438 | types into SQL types, and the intercessory protocol for slot |
| 439 | allocation to map slots onto database columns or sql expressions (for |
| 440 | implementing relational slots). |
| 441 | |
| 442 | *** [[http://common-lisp.net/project/elephant/][Elephant]] |
| 443 | |
| 444 | Elephant uses the CLOS MOP to transparently store any class to disk |
| 445 | and handle paging between the disk store and memory efficiently |
| 446 | without user intervention. |
| 447 | |
| 448 | * Sources and Further Reading |
| 449 | |
| 450 | ** Sources |
| 451 | |
| 452 | *** The Art of the Metaobject Protocol |
| 453 | |
| 454 | **** Kiczales, Gregor et al. MIT Press 1991 |
| 455 | |
| 456 | Highly recommended reading even if you plan to never implement a MOP |
| 457 | or use the CLOS one. The design principles it recommends are quite |
| 458 | useful. |
| 459 | |
| 460 | *** [[http://www.lisp.org/mop/contents.html][CLOS MOP Specification]] |
| 461 | |
| 462 | Specification of the MOP for CLOS defined in *The Art of the Metaobject Protocol*. |
| 463 | |
| 464 | *** [[http://citeseer.ist.psu.edu/399658.html][Metaobject Protocols: Why We Want Them and What Else They Can Do]] |
| 465 | |
| 466 | A short overview of MOP design principles followed by three example |
| 467 | metaobject protocols for Scheme. |
| 468 | |
| 469 | *** [[http://www2.parc.com/csl/groups/sda/projects/oi/towards-talk/transcript.html][Why Are Black Boxes so Hard to Reuse?]] |
| 470 | |
| 471 | Transcription of a talk on the benefits of open implementations of |
| 472 | software. It first discusses several problems that black box software |
| 473 | implementations pose, and then presents existing solutions. It shows |
| 474 | how the existing solutions are insufficient, and then provides |
| 475 | metaobject protocols as a solution to most of the problems. |
| 476 | |
| 477 | ** Further Reading |
| 478 | |
| 479 | *** [[http://citeseer.ist.psu.edu/chiba95metaobject.html][A Metaobject Protocol for C++]] |
| 480 | |
| 481 | Example of a purely compile time MOP. It implements the functionality |
| 482 | of a code walker and something similar to the Lisp macro system. |
| 483 | |
| 484 | *** [[http://www.parc.com/csl/groups/sda/publications/papers/Kiczales-TUT95/for-web.pdf][Open Implementations and Metaobject Protocols]] |
| 485 | |
| 486 | It is a bit long, but it seems to follow a similar structure to AMOP |
| 487 | in introducing MOPs and their usefulness. The pages are slides with |
| 488 | notes, and so the 331 pages might not actually take that long to read. |
| 489 | |
| 490 | ** Software |
| 491 | |
| 492 | *** [[http://common-lisp.net/project/closer/closer-mop.html][Closer to MOP]] |
| 493 | |
| 494 | Compatibility layer that attempts to present the *Art of the Metaobject |
| 495 | Protocol* MOP specification properly in as many Common Lisp |
| 496 | implementation as possible. |